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An efficient implementation of tlie nonequifibrium Green function (NEGF) metfiod combined with 
tfie density functionaf theory (DFT) using locafized pseudo-atomic orbitafs (PAOs) is presented for 
efectronic transport calculations of a system connected with two leads under a finite bias voltage. 
In the implementation, accurate and efficient methods are developed especially for evaluation of the 
density matrix and treatment of boundaries between the scattering region and the leads. Equilibrium 
and nonequilibrium contributions in the density matrix are evaluated with very high precision by a 
contour integration with a continued fraction representation of the Fermi-Dirac function and by a 
simple quadrature on the real axis with a small imaginary part, respectively. The Hartree potential 
is computed efficiently by a combination of the two dimensional fast Fourier transform (FFT) and 
a finite difference method, and the charge density near the boundaries is constructed with a careful 
treatment to avoid the spurious scattering at the boundaries. The efficiency of the implementation is 
demonstrated by rapid convergence properties of the density matrix. In addition, as an illustration, 
our method is applied for zigzag graphene nanoribbons, a Fe/MgO/Fe tunneling junction, and a 
LaMnOs/SrMnOs superlattice, demonstrating its applicability to a wide variety of systems. 

PACS numbers: 71.15.-m, 72.10.-d, 73.63.-b 



I. INTRODUCTION 



The non-equilibrium Green function (NEGF) 
methodi'Si'^'^i^'^ potentially has several advantages 
to investigate electronic transport properties of 
nano-scale materials such as single molecules j^i^ 
atomic wires j^ii^^ carbon based materials jiiii^ and thin 
layers F^^ii^ The potential advantages are summarized 
by the following features of the NEGF method: (i) 
the source and drain contacts are treated based on 
the same theoretical framework as for the scattering 
regioni^i^ (ii) the electronic structure of the scatter- 
ing region under a finite source-drain bias voltage is 
self-consistently determined by combining with first 
principle electronic structure calculation methods 
such as the density functional theory (DFT) and the 
Hartree-Fock (HE) method^>i^^>20^>22^>24^^^ 
(iii) many body effects in the transport properties, 
e.g., electron-phono n^^i^^'^^i'^^'^^i'^^ and electron-electron 
interactions, might be included through self- 
energies without largely deviating the theoretical 
framework, (iv) its applicability to large-scale systems 
can be anticipated, since the NEGF method relies 
practically on the locality of basis functions in real 
space, resulting in computations for sparse matrices?^ 
Due to those potential advantages, recently several 



groups have implemented the NEGF method coupled 
with the DFT or HF method using atomic-type or 
the other local basis functions with successful appli- 
cations for calculations of the electronic transport 
properties 
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However, a highly accurate and efficient implementa- 
tion method must be still developed from the following 
two reasons: The first obvious reason is to extend the ap- 
plicability of the NEGF method to large-scale systems. 
The efficient implementation might lead to more chal- 
lenging applications of the NEGF method to very large- 
scale complicated systems. The majority part in the com- 
putational effort of the NEGF method mainly comes from 
the evaluation of the density matrix which is decomposed 
into the evaluation of Green functions and numerical in- 
tegrations. Thus, the efficient calculation of the part is 
a key factor for extending the applicability to large-scale 
systems. Nevertheless, accurate and efficient methods for 
evaluating density matrix within the NEGF method have 
not been fully developed, although several methods have 
been already proposed J^'i^'^ To extend the applicability 
of the NEGF method to large-scale systems, a remark- 
ably efficient method that we have recently developed'*^ 
will be applied for the problem in this study, and we will 
show that the new method is much faster than the other 
methodfiSi The second reason is that spurious scattering 
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should be negligible when the NEGF method is extended 
to include the many body effects beyond the one parti- 
cle picturc-^i^ai^i^i^i^^iMiS^i^^.^ The spurious scatter- 
ing accompanied by the inaccurate implementation might 
make the many body effects indistinct in the electronic 
transport properties. One can imagine that the spurious 
scattering can be easily produced in the NEGF method, 
since unlike the conventional band structure calculations 
NEGF has to be evaluated by a patch work that the self 
consistent field (SCF) calculations of the source and drain 
leads are performed beforehand, and the calculated re- 
sults are incorporated in the NEGF calculations through 
the self energy and the boundary conditions between the 
scattering region and leads. Therefore, a careful treat- 
ment to handle the boundary conditions should be de- 
veloped to avoid the spurious scattering. 

In this paper, to address the above two issues we 
present an accurate and efhcient implementation of the 
NEGF method, in combination with DFT using pseudo- 
atomic orbitals (PAOs) and pseudopotentials, using a 
contour integration method which is based on a contin- 
ued fraction representation of the Fermi-Dirac function. 
For the accurate treatment of the boundary conditions 
between the scattering region and leads, we also develop 
a method for calculating the Hartree potential by a com- 
bination of the two dimensional fast Fourier transform 
(FFT) and a finite difference method so that the bound- 
ary condition can be correctly reproduced. In addition, 
we discuss a careful treatment to construct the charge 
density near the boundaries. The efficiency and accu- 
racy of our implementation are demonstrated by several 
numerical test calculations on convergence of the density 
matrix. 

This paper is organized as follows: In Sec. II the de- 
tails of our implementation for treating the equilibrium 
state of the scattering region are discussed by focusing 
on the evaluation of the equilibrium density matrix, the 
treatment for constructing the charge density near the 
boundaries, and an efficient method for calculating the 
Hartree potential. In Sec. Ill our implementation for the 
nonequilibrium state is described. In Sec. IV we demon- 
strate the accuracy and efficiency of the implementation 
by a series of numerical calculations and several applica- 
tions. In Sec. V our implementation of the NEGF method 
is summarized. 



II. EQUILIBRIUM STATE 

Since most of practical aspects in the implementation 
of the NEGF method coupled with DFT using the lo- 
calized PA09^2ii2iiii appear in the ground state calcula- 
tion of the system at equilibrium, we start our discussion 
from the electronic structure calculation of the equilib- 
rium ground state by using the Green function method 
with DFT. 
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FIG. 1: (a) Configuration of the system, treated by the NEGF 
method, with infinite left and right leads along the a-axis un- 
der a two dimensional periodic boundary condition on the 
bc-plane. (b) One dimensional system compacted from the 
configuration of (a) by considering the periodicity on the bc- 
plane, where the region C is an extended central region con- 
sisting of Co, I/O, and _Ro. 



A. Equilibrium Green function (EGF) 

Let us consider a system where one dimensional infi- 
nite cells are arranged with two dimensional periodicity 
as shown in Fig. 1. Throughout the paper, we assume 
that the electronic transport along the a-axis is of inter- 
est, and that the two dimensional periodicity spreads over 
the bc-plane. The one dimensional infinite cell consists 
of the central region denoted by Cq and the cells denoted 
by Li and Ri, where i = 0, 1, 2 • • • . All the cells Li and 
Ri, arranged semi-infinitely, contain the same number 
of atoms with the same structural configuration, respec- 
tively, but the cells Li and Ri can be different from each 
other. In the equilibrium state with a common chemical 
potential everywhere in the system, the electronic struc- 
ture of the system may be determined by DFT4Si^ Due 
to the periodicity of the bc-plane the one-particle Kohn- 
Sham (KS) wave function in the system is expressed by 
the Bloch function on the bc-plane using PAOs 4)ia lo- 
cated on site r,- as: 



1 y-^«k.R„y, 



,(k) 



,(r-T,;-R„), (I) 



where cr, i, and a are indices for the spin, eigenstate, 
site, and basis orbital, respectively. The lattice vector Rn 
and the Bloch wave vector k are given by Rn = Ibh + lcC, 
where b and c are the lattice vectors, and k = ki,h + kcC, 
where b and c are the reciprocal lattice vectors, respec- 
tively. The summation over i and a is considered for 
all the basis orbitals in the one dimensional infinite cell, 
which indicates no periodicity along the a-axis. Consid- 
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ering the variation of the total energy, within the con- 
ventional DFT, of the system expressed by the KS wave 
function Eq. (jlj with respect to coefRcients c we obtain 
the following KS matrix equation: 



ZT(k) (k) _ (k)(.(k) (k) 



(2) 



where c"^J is a column vector consisting of the coefBcients 
{c"^Jia}- The Hamiltonian and overlap matrices 

S^^^ are given by 



n 

c(k) _ V^„ik-R„„ 



(3) 
(4) 



where ha,iaji3,R„ and Siaj73,R„ are the Hamiltonian and 
overlap matrix elements between two basis functions 
(j^iaij ~ '''i) and 4>jp{Y — Tj — Rn), respectively. The over- 
lap matrix arises from the non-orthogonality of the PAO 
basis functionsj^i^i^ Now we consider an extended cen- 
tral region C composed of the regions Cq, Lq, and Rq as 
shown in Figs. 1 (a) and (b). The extension of the cen- 
tral region Cq is made so that the relaxation of electronic 
structure around the interfaces between the leads, Lq and 
Ro, and the central region Co can be allowed. In addi- 
tion, we impose two conditions: 



(i) The localized basis orbitals (j> in the region Cq over- 
lap with those in the regions Lq and Rq, but do not 
overlap with those in the regions Li and Ri. 

(ii) The localized basis orbitals (j> in the L.^ {Ri) region 
has no overlap with basis orbitals in the cells be- 
yond the nearest neighboring cells Li_i and 



written by a block tridiagonal form as follows: 
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cr,Li 



H 
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(k) 

cr,RiC 



H. 



(k) 

0-,-R.i 



(5) 



where H^^l, H^^} , and H^^l are Hamiltonian matrices 
of the central C, left Li and right Ri regions of which 
matrix size are the same as the number of basis orbitals 
^C, Nl, and in the regions C, Li, and respec- 
tively. The other block components in Eq. ([5]) are the 
Hamiltonian matrices connecting two regions among the 
regions, and these matrix sizes are deduced from those of 
the two regions. Also the completely same structure is 
found in the overlap matrix. Thus, the electronic struc- 
ture of the system given by Fig. 1(a) can be obtained 
by solving the one dimensional block chain model, being 
k-dependent, given by Fig. 1(b) and the corresponding 
Eq. @. By noting G^a\z){ZS'^^^ - H^^^) = I and mak- 
ing use of the block tridiagonal form of the Hamiltonian 
and overlap matrices, the Green function of the central 
region C can be written by 



:.(k) 



H 



(k) 

cr,C 



.(k) 



(Z) 



.(k) 



iZ)) (6) 



with self energies T.'-^liZ) and Si'^]j(^) defined by 

si^i(z) = {zs^^i-H^::i,jx 

G^^L{Z){ZSfl-H^^l^^), (7) 



G'^n{Z){ZS'^}c-H'^lc)^ 



(8) 



In our implementation the basis functions are strictly lo- 
calized in real space because of the generation of basis 
orbitals by a confinement scheme Therefore, once 
the localized basis orbitals with specific cutoff radii are 
chosen for each region, the two conditions can be always 
satisfied by just adjusting the size of the unit cells for Li 
and Ri . This is a benefit in the use of the strictly local- 
ized basis orbitals compared to other local basis orbitals 
such as Slater- and Gaussian- type orbitals. In the use of 
the strictly localized basis orbitals, the eigenvalue prob- 
lem in the Hilbert space spanned by the basis orbitals 
are solved without introducing any cutoff scheme. On 
the other hand, in the Green function method the ma- 
trix elements of the Hamiltonian and overlap matrices 
have to be truncated so as to satisfy the above two con- 
ditions in case of the other localized basis orbitals with 
the small but long tail. With the above two conditions 
(i) and (ii), the Hamiltonian matrix given by Eq. ^ is 



where G]^ [{Z) and G^^(Z) are surface Green functions 
of the left and right regions. 

It is worth pointing out that there is a total energy 
functional which can be variationally minimized with re- 
spect to charge density n if Eq. ^ is self-consistently 
solved. The details of derivation for the functional is 
given in Appendix A. 



B. Surface Green function 

In general, the surface Green function G^j^siZ) is de- 
fined by Gct'^s(Z) = {ZSs — Hs)^^, where Ss and Hg are 
the Hamiltonian and overlap matrices for the lead re- 
gions, and the suffix, s, is L or R. It is noted that due 
to the two conditions (i) and (ii) mentioned above the 
Hamiltonian and overlap matrices for the lead regions 
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can be written by a block tridiagonal form as follows: 

/iln H^2 \ 

H21 H22 H23 



H32 H: 



V 

S21 



V 



5*12 
5*22 



33 



(9) 



32 



5'23 
5'33 



(10) 



where the Hamiltonian can be spin and k-dependent, 
while the indices for them are omitted for simplification 
of the notation, and also the index i appearing in Hij 
corresponds to the cell number for the lead Li or Ri. It 
seems to be difficult to directly diagonalize the KS equa- 
tion Eq. ^ for the one dimensional block chain model 
because of the infinite dimension of the matrices. How- 
ever, instead by focusing on only the central region one 
can evaluate the Green function of the central region as 
a rather small problem of Nc x Nc in size. The effect 
of semi infinite regions L and R are included through 
the corresponding self energies 'S^^l^{Z) and Y,''^j^{Z). In 
order to practically calculate the Green function of the 
central region given by Eq. ([6]) , we introduce an approxi- 
mation where the regions Li{i — 1,2, ■■ ■) are all equiva- 
lent to each other with respect to the spatial charge dis- 
tribution, the KS Hamiltonian, and the relevant density 
matrix which are calculated in advance by adopting the 
system of which unit cell is Li and by using the conven- 
tional band structure calculation. The same approxima- 
tion also applies for the regions Ri{i = 1, 2, • • • ). Strictly 
speaking, the assumption is not correct, since the charge 
distribution must be affected by the interaction between 
the central region C and the regions Li (Ri). However, 
if the size of the unit vector along the a-axis for the re- 
gions Lq and Ro in the extended central region C is large 
enough, the assumption will be asymptotically correct as 
the unit vector becomes larger. The approximation en- 
ables us to evaluate the surface Green function by the 
iterative method.— The efficient iterative scheme can be 
performed by the following procedure: 





= ^i ^Cli, 


(11) 
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(12) 


£5,4+1 




(13) 




^ €i — PiQi — aibi, 


(14) 


Oti+l 




(15) 


A+1 


= Pibi 


(16) 



with a set of initial values 



es,o = ZSii — Hii, 

eo = ZSii — Hii, 

ao = -{ZS12 - H12), 

Po = —{ZS21 — H2l). 



(17) 
(18) 
(19) 
(20) 



By the iterative calculation, in most cases the inverse 
of Csji rapidly converges at a part of the surface Green 
function: 



lim 



(21) 



where Gs,ii is the (1,1) block element of the surface 
Green function {ZSg — Hs)~^. The (1,1) block element 
Gs,ii of X (or Nji x Nn) in size has all the neces- 
sary information to calculate the self-energies Y,^^l^{Z) 

and S][^'']j(Z), since there is no contribution from the 
other block elements because of the two conditions (i) 
and (ii) mentioned above. In practice, the convergence 
in the iterative calculation is very fast and the Frobenius 
norm, defined by (^^ \ies,^+l)w ~ ie.,^)w\Y^^ of IQ-^ 
(eV) is obtained by typically only 7 iterations. 

C. Equilibrium density matrix 

One of practical difficulties in the implementa- 
tion of the Green function method is how the equi- 
librium density matrix is evaluated efficiently and 
accurately . ^'^i-'^^i'^^i^'^i'^^i^^i^^i^-'^ In our implementation, the 
equilibrium density matrix is highly efficiently computed 
using the contour integration method with a special 
treatment of the Fermi-Dirac function f.^^ If the Hamil- 
tonian and overlap matrices associated with Eq. ^ are 
k-dependent, it turns out that the spectrum function in 
the Lehmann representation of the central Green function 
is complex number in general as discussed in Appendix 
B of Ref.**^. Then, the density matrix p^^'^ , where one 
of the associated basis orbitals is in the central cell and 
the other is in the cell denoted by Rn, is given by making 
use of both the retarded and advanced Green functions 
G'-^liE + iO+) and g'-^1{E - iO+) as 



„(cq) 



1 



BZ 



p(k) ^g-,k.R„ 



with 



Pcr,± 



/oo 
dEG';;'l{E±tO+)f{E-f,), 
-00 



(22) 



(23) 



where Vc is the volume of the unit cell, Jg^ represents the 
integration over the first Brillouin zone, O"*" a positive 
infinitesimal, and fi a chemical potential. The integra- 
tion over k-space is numerically performed by using the 
Monkhorst-Pack meshiS It is also noted that the phase 
factor e"''''^" appears through Eq. ([T]). If the Hamilto- 
nian and overlap matrices are k- independent, Eq. 
can be simplified into a well-known formula: 



Pcr,0 



Im 



1 

— / dEG„,c{E + iQ+)f{E- fi) 
J-00 



(24) 



In case of the k-independent problem, the simplified for- 
mula is used, since the number of the evaluation of the 
Green function is reduced by half. 
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For the efficient integration in Eq. ((23)) . in our im- 
plementation the Fermi-Dirac function is expressed by a 
continued fraction representation derived from a hyper- 
geometric function-- so that the structure of poles can 
be suitable for the integration associated with the Green 
function as follows: 



1 



1 + cxp(a;) 



(1)2 



(i; 



of poles in the summation of Eq. p6)) required for the 
sufficient convergence depends on the electronic temper- 
ature T, the fully convergent result within double preci- 
sion is achieved by the use of only 100 poles in case of 
T = 600 K as shown later. 

If forces on atoms are calculated based on the con- 
ventional DFT scheme using the non-orthogonal basis 
orbitalsfi^, the evaluation of the energy density matrix 
Co- is needed. In Appendix B, we derive the calculation 
scheme of the equilibrium energy density matrix e^'^^ 
based on the contour integration method. 



(2M - 1) 



Rn 



P=l 



E 



R 



V 



X -\- iz. 



(25) 



■p 



where x = (3[z — with f3 — T is electronic tem- 
perature, z and X are complex variables. Also, Zp and Rp 
are the poles and the associated residues of the contin- 
ued fraction representation Eq. ([25)) . which are obtained 
via an eigenvalue problem derived from Eq. (|25p .^^ Since 
all the Zp are real numbers, the poles izp are located on 
the imaginary axis. One may find an interesting distri- 
bution of the poles on the complex plane that the in- 
terval between neighboring poles are uniformly located 
up to about 61 % of the total number of poles on the 
half complex plane with the same interval 27r, and from 
then onward it increases very rapidly as the distance be- 
tween the pole and the real axis increases. The struc- 
ture of the poles in Eq. (^5]) allows us to efficiently eval- 
uate Eq. ((23)) because of the asymptotic change, l/Z, 
of the Green function in the faraway region of the real 
axis. In other words, the denser poles are allocated for 
the rapidly varying range of the Green function, and the 
coarser for smoothly. In addition, there is no ambigu- 
ity in the choice of the path in the contour integration 
unlike the other schemes ^iii^i^ which is one of advan- 
tages in our method. By terminating the summation of 
Eq. (|25)) at a finite number of poles Np, the integration 
of Eq. d^Sj) can be performed by the contour integration 
where the poles in the upper and lower half planes are 
taken into account for the terms with plus and minus 
signs in Eq. ()23p . respectively, and the explicit formula 
is given by 



1 

4^ 



(k,0) 



=1 



(26) 



where au 



fi ± i- 



and ^i^J^'^"^ is the zeroth order mo- 



ment of the Green function G^'^i. The zeroth order mo- 

(T.O 

ment fi'^'^^ is easily calculated by ^("^'^^ — iR G^^lj{iR) 
which can be derived from the moment representation 
of the Green function;^ where i? is a large real number 
and in this study 10^° (eV) is used in order to make the 
higher order moments negligible. Although the number 



D. Charge density near the boundary 

Even though the basis functions we used are strictly 
localized in real space, there is the nonnegligible contri- 
bution for the charge density near the boundary between 
the central and lead regions from the basis functions lo- 
cated in the lead regions. Note that any treatment for 
the contribution to the charge density has not been clari- 
fied in the other implementations . -^^'^^i^^'^^i^^'^^i^^ Thus, 
we carefully calculate the charge density in the central 
region by considering three contributions: 



n,(r) = 4^^)(r) + 2n(f=)(r) + 4^^)(r), 



(27) 



where the suffix, s, is L or R, and ^^'^'^''(r), na^\v), and 
ni^^ (r) are the charge densities contributed from the ba- 
sis functions located in the central, the lead and central, 
and the lead regions, respectively. Note that the sum- 
mation over s is not required in Eq. ()27p because of the 
conditions (i) and (ii). Each charge contribution is ex- 
plicitly given by 



n id, if} 

x0„(r-rj)0j;3(r-T,- -R„), (28) 

n.n' iajf^ 

X(j)ia{r -n - (Rn ± a)) 
xc^jfi{r-T, - Rn'), (29) 

n.n' j73 

X(/'ja(r-Tj - (R„±a)) 
x0j;5(r-r, -(Rn'±a)), (30) 

where a is the lattice vector of the unit cell for the Lq or 
i?o region along the a-axis. The displacement of —a (-l-a) 
denotes that the basis function is placed in the Li (i?i) 
region in the configuration shown in Fig. 1. The charge 
density given by Eq. ()28p is calculated by the equilibrium 

density matrix p'^'f^ ^^-^^ given by Eq. (|^ . Although 
Eq. d^S]) can give a finite electron density on the outside 
of the central cell with R^ = because of the overlap of 
basis functions, the contribution is reflected in the central 
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cell with Rn = by considering the periodicity on the 
bc-plane. In the nonequilibrium case, the equilibrium 
density matrix is only replaced by the nonequilibrium 
density matrix which will be discussed in the next section. 
Each term in the summations for the two contributions 
na'^\r) and ^^^''(r) survive only if the overlap of the 
associated two basis orbitals is not zero in the central 
region. Since the original central region Co is extended 
by adding the Lq and Rq regions, it is expected that the 
density matrix elements p''^%r^^jPr^, and pJ1r„j73r„, 
in Eqs. ((29|) and pO|) are close to those of the leads in 
the equilibrium condition. Therefore, the density matrix 
elements of the leads calculated by the conventional band 
structure calculations are used for Eqs. (|^^ and ([50)1 . 

Due to the treatment the charge densities ^^'^''(r) and 
Uct (r) are independent of the SCF iteration so that for 
the computational efficiency they can be computed on a 
numerical mesh and stored before the SCF iteration. The 
factor 2 for ria^'^ (r) in Eq. (|27p is due to taking account of 
the contribution from na^\r), while the factor does not 

appear for n^^^ (r) , since all the paired terms are included 
by the double summation in Eq. ([50)1 . 

We add a note that the same consideration has to be 
applied even for the calculation of the density of states 
(DOS). In this case, the contribution from off-diagonal 
block Green functions connecting the central and lead 
regions should be added to the DOS of the central region 
C calculated by g'^^]j{Z). The off-diagonal block Green 
functions can be calculated from g'^^q{Z) and the surface 
Green functions, G''^1^^^{Z) and G''^]^^j^^{Z), by making 
use of the identity Gly"\z){ZS^^'^ - H^^^) = I as follows: 

- -Gi^^(^) {zScs, - H^J:Q G(^,\,,(Z), (31) 
where s is L or i?. 



E. Hartree potential with the boundary condition 

The Hartree potential in the central region is calcu- 
lated under the boundary condition that the Hartree 
potential at the boundary between the central C and 
Li{Ri) regions is same as that of the lead, where the 
Hartree potential in both the lead regions is calculated 
using the conventional band structure calculation before 
the calculation of the infinite chain in Fig. 1(b) using 
the Green function. In our implementation, the Hartree 
potential for the central region with the boundary condi- 
tion is efficiently evaluated by a combination of the two 
dimensional EFT and a finite difference method, while 
other schemes are used in the other implementations i^ii^ 
The majority part of the Hartree potential in our treat- 
ment is accurately calculated by considering the neu- 
tral atom potential which is the sum of the local po- 
tential of the pseudopotential and the Hartree potential 



by the confined charge for the neutralization.^^ The neu- 
tral atom potential depends on only the atomic structure 
and atomic species, and has no relation with the bound- 
ary condition. The effect of the relaxation of charge dis- 
tribution on the Hartree potential is taken into account 
by the remaining minority part of the Hartree potential 
AVh given by 

V2AyH(r) = -47rAn(r), (32) 

where An(r) is defined by the difference between the elec- 
tron density n{r){= X]cr"'^('')) calculated by the Green 
function and the atomic electron density^'^ rt*^°)(r) cal- 
culated by superposition of each atomic electron density 
n'f'\r) at atomic site i as follows: 

An(r) =n(r) -n("'(r). (33) 

The Fourier transformation of Eq. (|32p on the yz plane, 
corresponding to the bc-plane depicted in Fig. 1(a), 
yields 

" " -47rAn(x, G), (34) 

where G = Gbb -I- GcC with integer numbers Gb and Gc- 
By approximating the second derivative in Eq. p4[) with 
the simplest finite difference, 

d^AVn{x„) ^ AVH(x„+i)~2AVH(a:n) + AVH(x„-i) 
dx2 ~ (Aa;)2 

(35) 

we obtain a simultaneous linear equation AAF = i? of 
{Na — 1) X {Na — 1) in size with a tridiagonal matrix 
defined by 

Ann = 2+(A2;)2g2, 

^n(n-l-l) = ^(n+l)n = ^1 (36) 

and a vector defined by 

Bi ^ 47r(Aa;)2An(2;i,G) AVH(a;o,G), 
Bn = An{AxfAfi{xn,G), 
Bn^-1 - 47r(A.x)2An(2;^v„-i,G) + AVk(2:jv„,G), 

(37) 

where Ax is the interval between neighboring points Xn 
and Xn+i, n runs from 1 to Na — 1, and AVyl{xq,G) 
and AT4j(a;Ar„ , G) are the boundary conditions. Since 
the lattice vector of the extended central region C along 
the a-axis is divided by Na for the discretization, the po- 
sitions a;o and xm^ are situated at the boundary, along 
the a-axis, of the unit cell of the central region C. Thus, 
AVH(a;o,G) and AVH(xAr^,G) can be calculated from the 
Hartree potential at the left boundaries along the a-axis 
of the left and right leads, respectively. After solving 
the simultaneous linear equations for all the G points. 
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the inverse two dimensional FFT on the bc-plane yields 
the difference Hartree potential AV^h under the bound- 
ary conditions. It is apparent that there is no ambigu- 
ity for the inclusion of the boundary conditions in the 
method. Although the treatment can be easily extended 
to the higher order finite difference to the second deriva- 
tive, we restrict ourselves to the simplest case in the im- 
plementation because of its sufhcient accuracy. Employ- 
ing FFT for two dimensional Fourier transformation, the 
whole computational effort to solve the Poisson equation 
Eq. (j32p under the boundary conditions is estimated to 
he ^ NaX Nb log(iVb) x Nc log(iVc) which is slightly supe- 
rior to that of the three dimensional FFT, where Na, Nb, 
and Nc are the number of meshes for the discretization 
along the a-, b-, and c-axes. 

F. Hamiltonian and overlap matrix elements 

Our implementation of the NEGF method with DFT 
is based on the strictly localized PAOa^i^i^ and a 
norm-conserving pseudopotential method.— Within the 
scheme, the calculation of the matrix elements such as 
the overlap and kinetic energy integrals, consisting of two 
center integrals, is performed using a Fourier transform 
method,^- while the other matrix elements for T4c and 
At4j, which cannot be decomposed into two center inte- 
grals, are evaluated by the numerical integration on the 
regular mesh in real spacC)^ The further details on how 
the elements of the Hamiltonian and overlap matrices are 
calculated can be found in Ref.^^. 

In addition to the above evaluation of the Hamiltonian 
and overlap matrix elements, the Hamiltonian matrix el- 
ements associated with the basis orbitals situated at near 
the boundary are treated in a special way as explained 
below. If the tails of two basis orbitals located on atoms 
in the central region C go beyond the boundary between 
the central and the lead regions, the associated Hamil- 
tonian matrix element is replaced by the corresponding 
element in the lead region calculated by the conventional 
band structure calculation. The case can happen only if 
the two basis orbitals are located in the region Lo(i?o) 
because of the condition (i) so that the replacement of 
the Hamiltonian matrix element can be always possible. 
The replacement is made by assuming that the poten- 
tial profile near the boundary is similar to that near the 
boundary between the regions and £2(i?2), and 

can be justified if the size of the region Lo{Ro) is large 
enough. 



G. Charge mixing 

Compared to conventional band structure calculations, 
it seems that the NEGF method tends to suffer from 
difficulty in obtaining the SCF convergence. Our obser- 
vation in several cases suggests that the difficulty may 
come from charge sloshing along the a-axis during the 



SCF iteration. The difference Hartree potential AVa 
change largely by imposition of the boundary condition 
even for a small variation in the charge density distri- 
bution, resulting in a serious charge sloshing along the 
a-axis. Thus, we consider suppression of the charge slosh- 
ing along the a-axis by introducing the following weight 
factor w: 

where |Go| is a smaller one of either |6| or |c|, and kq and 
Ki are adjustable parameters, while keeping kq < ki. 
The prefactor g{xi) is given by 

with definitions: 

i-l 

dL{x^) = ^AnH(xfc,G = 0), (40) 

k=0 

Na-1 

dR(x,) = AnH(xfc,G = 0), (41) 

fe='i+i 

where ^ is an adjustable parameter. Noting that 
AnH(a^fc7 G = 0) is the number of difference electron den- 
sity of each layer indexed by fc, and that the Coulomb 
potential induced by each charged layer depends lin- 
early on the distance from the layer, one can notice that 
\dL{xi) — dR{xi)\ is proportional to the electric field at 
position i. Therefore, Eq. pS]) takes charge density under 
a large electric field into significant account in addition to 
the suppression of the charge sloshing in the bc-plane in 
a sense by the Kerker method.^** In our implementation, 
the weight factor given by Eq. (|38p is combined with the 
Kerker method^ and the residual minimization method 
in a direct inversion iterative subspace (RMM-DIIS)^ 
with substantial improvement. 

A technical remark should also be added to avoid a 
local trap problem in the SCF calculation. In systems 
having a long a-axis, a unphysical charge distribution, 
corresponding to a large charge separation in real space, 
tends to be obtained even after achieving the self consis- 
tency. In this case the Hamiltonian of the central region 
C at the first SCF iteration, which are calculated via su- 
perposition of atomic charge density, is far from the self- 
consistently converged one, while the Hamiltonian matrix 
used in the calculation for the self-energy is determined in 
a self-consistent manner beforehand. The inconsistency 
between the two matrices tends to produce a unphysi- 
cal charge distribution at the first SCF iteration. Once 
the situation happens at the first SCF iteration, in many 
cases the electronic structure keeps trapped during the 
subsequent SCF iteration, which is a serious problem in 
practical applications. However, the local trap problem 
can be overcome by a simple scheme that the first few 
SCF iterations are performed by using the conventional 
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band scheme and then onward the solver is switched from 
the band scheme to the NEGF method. In the band 
structure calculation for the first few iterations, such a 
unphysical charge distribution does not appear due to no 
self-energy involved. In most cases we find that the sim- 
ple scheme works well to avoid the local trap problem in 
the SCF convergence. 

III. NONEQUILIBRIUM STATE 

A. Nonequilibrium density matrix 

Based on the NEGF theory mainly developed by 
Schwinger— and Keldysh?, the density matrix in the 
nonequilibrium state of the central region is evaluated 

]j. .,17.18.21.22 

(ncq) _ (cq) , a /ac)-. 

In addition to the equilibrium density matrix p^j'^ given 
by Eq. (P^ . a correction term defined by 

Ap,,R„ - / dfc3Ap^'')e-*-^" (43) 

is taken into account, where Ap^^ is defined by 
1 

Api-^) = - / dEG'•^liE + ^e)^(J:l{E) 

xG^^l.{E~ie)Af{E) (44) 

with 

T(^l (E) = ^ (e^^I (E + ^e) - E^^l (E - (45) 

and 

Af{E)=f{E-p,,)-f{E-p,,). (46) 

Either the left p^ or right chemical potential pR which 
is lower than the other is used for the calculation of 
the equilibrium density matrix in Eq. (|42| . As well, in 
Eqs. (|44l) and (|46p the chemical potentials are given by 
the rule that si ~ R and S2 = L \i pi^ < pj^, and 
si = L and S2 = R ii pr < pL- Starting from the 
NEGF theory, the formula, Eq. ([42]) . may be derived by 
introducing two assumptions. The first assumption is 
that the occupation of the wave functions incoming from 
the left (right) region still obeys the Fermi-Dirac func- 
tion with the left (right) chemical potential even in the 
central region. The assumption can be justified within 
at least the one-particle picture, since the same result 
can be obtained from the Lippmann-Schwinger equa- 
tion for a non-interacting system . ^'^"^i"^^ The second as- 
sumption is that in the central region the states in the 
energy regime below the lower chemical potential is in 
equilibrium due to the other physical obstacles such as 



the electron-phono n^^'^^i'^^i'^^'^^i'^^ and electron-electron 
interactions r^i^i^^i^ which are not considered explic- 
itly in our implementation, although the definite role of 
those obstacles is obscure as for the occupation of the 
states. The second assumption allows electrons to oc- 
cupy in highly localized states below the lower chemical 
potential through the first term of Eq. I^^ . Only the 
states in the energy regime between two chemical po- 
tentials are treated as in the nonequilibrium condition 
in which the contribution of the wave functions incom- 
ing from the lead with the higher chemical potential is 
taken into account to form the correction term given by 
Eq. (gll). 

The integrand in Eq. (|44p is not analytic apart from 
the real axis, since the integrand is a function of both 
Z{= E + ie) and Z*. Thus, one cannot apply the con- 
tour integration method that we use for the equilibrium 
density matrix. Instead, a simple rectangular quadrature 
scheme is applied to the integration of Eq. on the 
real axis with a small imaginary part e. Since the inte- 
grand contains the difference between two Fermi-Dirac 
functions, the energy range for the integration can be ef- 
fectively reduced to a narrow range that the difference 
is larger than a threshold, where the threshold of 10"^^ 
is used in this study. With the threshold and the step 
width of O.OI (eV), the number of meshes on the real 
axis is 152 for \pL - pr\ = 0.1 (eV) at T = 300 K. The 
convergence speed depends on the shape of the integrand 
and how large e is employed for smearing the integrand, 
which will be discussed later. 



B. Source-drain and gate bias voltages 

The source-drain bias voltage applied to the left and 
right leads is easily incorporated by adding a constant 
electric potential Vb to the Hartree potential in the right 
lead region. The effect of the bias voltage appears at 
three places. The first effect is that the Hamiltonian 
matrix in the right region given by Eq. ^ is replaced 
using Eq. (fTO)) as 

Hr^Hr + V^^Sr. (47) 

This can be easily confirmed by noting that the matrix 
elements for the constant potential Vb is V^Sr. The 
off-diagonal block elements H^cLi^ CRi^ ^^'^ 

^a^RiC appearing Eqs. (O and ([8]) are also replaced in 
the same way. The second effect is that the chemical 
potential of the right lead is replaced as 

PR^PR + V^,. (48) 

The treatment is made so that the first replacement can 
be regarded as just shifting the origin of energy in the 
right lead. The last effect is that the boundary condition 
in Eq. ((37)) is replaced as 

AVh{xn^,G)^AV{i{xn^,G), (49) 
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where AV^{xn^ , G) is calculated by the Fourier transfor- 
mation on the bc-plane for AVh at xjv^ plus Vb- Since 
only the difference of the bias voltages applied to the left 
and right leads affects the result, one can consider the 
replacements on only the right lead at the three places 
as shown above. It is noted that the replacement by 
Eq. (j49p corresponds to adding a linear potential ax + b 
to the Hartree potential in the central region C, where 
a and b are determined by the boundary conditions: 
AV^{xN^,G) and AFh(xo,G). 

In our implementation, the gate voltage Vg{x) is 
treated by adding an electric potential defined by 



Vg{x) - exp 



(50) 



where Vg"'' is a constant value corresponding to the gate 
voltage, Xc the center of the region Cq, and d the length 
of the unit vector along a-axis for the region Cq. Due 
to the form of Eq. ([50]) . the applied gate voltage affects 
mainly the region Co in the central region C. The electric 
potential may resemble the potential produced by the 
image charges <^ 



Transmission and current 



IV. NUMERICAL RESULTS 

A. Computational details 

All the calculations in this study were performed 
by the DFT code, OpcnMX.f^-* The PAOs centered on 
atomic sites are used as basis functions i^i^i^ The 
PAO basis functions we used, generated by a confine- 
ment scheme)^2ii^ are specified by H5.5-s2, C4.5-s2p2, 
O5.0-s2p2dl, Fe5.0-s2p2dl, Mg5.5-s2p2, La7.0-s3p2dlfl, 
Sr7.0-s3p2dlfl, and Mn6.0-s3p2d2, where the abbrevi- 
ation of basis functions, such as C4.5-s2p2, represents 
that C stands for the atomic symbol, 4.5 the cutoff ra- 
dius (bohr) in the generation by the confinement scheme, 
and s2p2 means the employment of two primitive orbital 
for each of s- and p-orbitals. Norm-conserving pscudopo- 
tentials are used in a separable form with multiple pro- 
jectors to replace the deep core potential into a shallow 
potential!^ Also, a local density approximation (LDA) to 
the exchange-correlation potential is employed. while 
a generalized gradient approximation (GGA)^^ is used 
only for calculations of the LaMnOa/SrMnOa super lat- 
tice. The real space grid techniques are used with the 
cutoff energies of 120-200 Ry in numerical integrations 
and the solution of Poisson equation using FFT^^ In ad- 
dition, the projector expansion method is employed in 
the calculation of three-center integrals for the deep neu- 
tral atom potentialsi^ 



The spin resolved transmission is evaluated by the Lan- 
dauer formula for the non-interacting central region C 
connected with two leads: 



T^iE) = 1 / dfc^Tf )(£;), (51) 
Vc Jbz 



where T^^\e) is the spin and k-resolved transmission 
defined by 

Tf)(£;) = Tr\T^^l{E)G^^liE + ^e) 



^^%iE)G':M^^^e)\. (52) 

Using the transmission formula, the current is evaluated 
by 



dET^{E)Af{E). 



(53) 



The formula can be derived by starting from a more 
general formula of the current for the interacting cen- 
tral region C and by replacing the involved Green func- 
tions with the non-interacting Green functions as shown 
by Meir and Wingreenj^ We perform the integration in 
Eq. (|53p on the real axis with a small imaginary part e by 
the same way as for the nonequilibrium density matrix 
of Eq. dMl). 



B. Convergence properties 

The accuracy and efficiency of the implementation are 
mainly determined by the evaluation of density matrix 
given by Eq. (|42l) which consists of two contributions: the 
equilibrium and non-equilibrium terms given by Eqs. (|22p 
and (j43|l . respectively. In this subsection, we discuss 
the convergence properties of the equilibrium and non- 
equilibrium terms in the density matrix as a function of 
numerical parameters. 

Since the majority part of the density matrix given by 
Eq. (|42p is the equilibrium contribution, let us first dis- 
cuss convergency of the equilibrium density matrix as a 
function of poles. The absolute error in {Etot — Egcu) of 
a carbon linear chain is shown in Fig. 2(a) as a function 
of the number of poles in order to illustrate the conver- 
gence property for the equilibrium density matrix under 
zero bias voltage, where (i^tot ~ ^-scif) can be regarded 
as a conventional expression of the total energy in DFT, 
and the definitions of the two energy terms i?tot and i?soif 
with the Fermi-Dirac function are given in Appendix A. 
For comparison the result calculated by a closed contour 
method is also showUfi^iS^ One can see that the accu- 
racy of 10^^ eV per atom is obtained using 140, 100, and 
70 poles for the electronic temperature of 300, 600, and 
1200 K, respectively, while about 400 energy points are 
needed to obtain the same accuracy using the closed con- 
tour method at 600 K.^^ For most cases, we find that the 
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FIG. 2: (Color online) (a) Absolute error in (-£/tot — ^'scif ) per 
atom of a linear carbon chain with a bond length of 1.4 A un- 
der the zero bias voltage at electronic temperature of 300, 600, 
and 1200 K, where the regions Lq, Ro, and Co contain four 
carbon atoms, respectively. For comparison the same calcu- 
lation ( closed contour) using a closed contour metho d^^'^^ is 
also shown for T=600 K. The definitions of Etot and -Esoif are 
found in Appendix A. The reference values are obtained from 
calculations with a large number of poles, (b) Total DOS of 
the carbon linear chain, calculated by the conventional band 
structure calculation (solid line) and the EGF method (dot- 
ted line), under the zero bias voltage at 300 K, where 160 
poles are used for the integration of the equilibrium density 
matrix. It is hard to distinguish two lines due to the nearly 
equivalent results. 
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FIG. 3; (Color online) (a) Absolute error in (i?tot — ^'scif) 
per atom of the same linear carbon chain as in Fig. 2, calcu- 
lated by various imaginary parts, under a finite bias voltage 
of 0.5 V, where the other calculation conditions are same as 
for Fig. 2(b). The number pointed by the arrow denotes a 
grid spacing (eV) corresponding to the number of grid points. 
The reference values are obtained from calculations with a 
large number of grid points, (b) MuUiken populations in the 
carbon chain. The sequential numbers 1 and 12 correspond to 
the most left and right hand side atoms in the central region 
C, respectively. 



convergence rate is similar to the case shown in Fig. 2(a). 
In general, the number of poles to achieve the accuracy 
of 10^^ cV per atom must be proportional to the inverse 
of T, since the interval between neighboring poles of the 
continued fraction given by Eq. ([25|) is scaled by fceT. 
This fact implies that the computational effort increases 
as the electronic temperature decreases. However, we 
generally use electronic temperature from 300 to 1000 K 
for practical calculations, which means that the use of 
100 poles is enough for practical purposes. Therefore, it 
can be concluded that the most contribution of the den- 
sity matrix can be very accurately evaluated with a small 
number of poles, i.e., 100. 

To demonstrate the proper treatment of the boundary 
between the lead and the central regions in our imple- 
mentation, in Fig. 2(b) we show a comparison between 
the conventional band structure and the EGF calcula- 



tions with respect to DOS of the carbon linear chain. 
The comparison provides a severe test to check whether 
the EGF method is properly implemented or not. It can 
be confirmed that DOS calculated by the EGF method is 
nearly equivalent to that by the conventional band struc- 
ture calculation, which clearly shows the proper treat- 
ment of the boundary between the lead and the central 
regions in our scheme. 

As explained before, the integration of Eq. (UH) re- 
quired for the evaluation of the nonequilibrium term 
in the density matrix has to be performed on the real 
axis with a small imaginary part because contour in- 
tegration schemes may not be applied due to the non- 
analytic nature of the integrand. The treatment might 
suffer from numerical instabilities in the SCF iteration, 
since the integrand can rapidly vary due to the existence 
of poles of Green function located on the real axis. A 
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remedy to avoid the numerical problem is to smear the 
Green function by introducing a relatively large imagi- 
nary part.— ii^i^iiS^ 



To investigate convergency of the nonequilibrium term 
in the density matrix, in Fig. 3(a) we show the abso- 
lute error in (i?tot — -E'soif) of the same infinite carbon 
chain as in Fig. 2 but under a finite bias voltage of 0.5 
eV as a function of the number of regular grid points 
used for the evaluation of the nonequilibrium term in 
the density matrix given by Eqs. pS)) and ([33]). We also 
tested the Gauss-Legendre quadrature for the integra- 
tion of the nonequilibrium term besides the integration 
using the regular grid, but found that the convergence 
rate of the Gauss-Legendre quadrature is rather slower 
than the simple scheme possibly due to the spiky struc- 
ture of the integrand. Thus, we have decided to use the 
simple scheme using the regular grid. As expected, it 
turns out that the number of grid points to achieve the 
accuracy of 10~^ eV per atom increases as the imaginary 
part becomes smaller. However, the accuracy of 10~^ eV 
is attainable using about 100 grid points in case of the 
imaginary part of 0.01 eV, while a few thousands grid 
points have to be used to achieve the same accuracy for 
the imaginary part of 0.0001 eV. 
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Although the accuracy of 10~* eV can be achieved by 
introducing the smearing scheme, however, one may con- 
sider that results can be affected by the introduction of an 
imaginary part. In order to find a compromise between 
the accuracy and efficiency, the MuUiken population of 
the carbon linear chain under the finite bias voltage of 
0.5 eV is shown in Fig. 3(b). We see that the use of the 
imaginary part of 0.01 eV gives a result comparable to 
that obtained by the use of 0.0001 eV. Thus, the imag- 
inary part of 0.01 eV can be a compromise between the 
accuracy and efficiency in this case. As a result, one can 
find that the energy points of 200 (100 and 100 for the 
equilibrium and nonequilibrium density matrices, respec- 
tively) are enough to achieve the accuracy of 10~® eV per 
atom in (i?tot — Esc\i) at 600 K. However, it should be 
mentioned that a proper choice of the imaginary part 
must depend on the electronic structure of systems, and 
the careful consideration must be taken into account es- 
pecially for a case that the spiky DOS appears in between 
two chemical potentials of the leads, while for the equilib- 
rium part of the density matrix the convergence property 
is insensitive to the electronic structure. 

We also note that the accurate evaluation of the den- 
sity matrix makes the SCF calculation stable even under 
a finite bias voltage. In fact, for the case with the bias 
voltage of 0.5 V the number of the SCF iterations to 
achieve the residual norm of 10~^^ for the charge density 
difference is 29 which is nearly equivalent to that, 30, for 
the zero bias case. 



FIG. 4: (Color online) (a) Currents of the linear carbon chain 
calculated by the SCF calculations (solid line) and the inter- 
polation scheme (dotted line), (b) Transmission of the lin- 
ear carbon chain under a bias voltage of 0.3 V, calculated by 
the SCF calculations (solid line) and the interpolation scheme 
(dotted line). The imaginary part of 0.01 and the grid spacing 
of 0.01 eV are used for the integration of the nonequilibrium 
term in the density matrix. The other calculation conditions 
are same as for Fig. 2(b) 



C. Interpolation of the effect by the bias voltage 

Since for large-scale systems it is very time-consuming 
to perform the SCF calculation at each bias voltage, here 
we propose an interpolation scheme to reduce the compu- 
tational cost in the calculations by the NEGF method. 
The interpolation scheme is performed in the following 
way: (i) the SCF calculations are performed for a few 
bias voltages which are selected in the regime of the bias 
voltage of interest, (ii) when the transmission and cur- 
rent are calculated, a linear interpolation is made for the 
Hamiltonian block elements, h'^^\ and H^^},, of the cen- 
tral scattering region and the right lead, and the chemical 
potential, ii^i , of the right lead by 

= XH!^^^ + {l-X)Hi^^\ (54) 

- ^h^j:h' + ~ (55) 

m = x^l^^^ + [1 - x)ti^^\ (56) 
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FIG. 5: (Color online) 8-ZGNR with (a) a FM junction and 
(b) an AFM junction together with the spatial distribution 
of the spin density at the source-drain bias voltage Vb = 
V. The zigzag edges are terminated by hydrogen atoms. The 
isosurface value of |0.002| is used for drawing the spin density. 



where the indices 1 and 2 in the superscript mean that 
the quantities are calculated or used at the corresponding 
bias voltages where the SCF calculations are performed 
beforehand. Note that it is also possible to perform the 
interpolation for k-independent Hamiltonian matrix ele- 
ments instead of Eqs. and ([55)) . In general, A should 
range from to 1 for the moderate interpolation. A com- 
parison between the fully self consistent and the interpo- 
lated results is shown with respect to the current and 
transmission in the linear carbon chain in Figs. 4(a) and 
(b). In this case, the SCF calculations at three bias volt- 
ages of 0, 0.5, and 1.0 V are performed, and the results at 
the other bias voltages are obtained by the interpolation 
scheme. For comparison we also calculate the currents 
via the SCF calculations at all the bias voltages. It is 
confirmed that the simple interpolation scheme gives no- 
tably accurate results for both the calculations of the cur- 
rent and transmission. Although the proper selection of 
bias voltages used for the SCF calculations may depend 
on systems, the result suggests that the simple scheme 
is very useful to interpolate the effect of the bias voltage 
while keeping the accuracy of the calculations. 



D. Applications 

1. Zigzag graphene nanoribbons 

As an illustration of our implementation, we investi- 
gate transport properties of zigzag graphene nanoribbons 
(ZGNRs) with different magnetic configurations. A char- 
acteristic feature in the band structure of ZGNR is the 
appearance of flat bands around X-point near the Fermi 
level, resulting in spin-polarization of associated states 
located at the zigzag edgesJSSiSZ Thus, so far several in- 
triguing transport properties have been theoretically pre- 
dicted especially for ZGNRs among GNRs by focusing 
on the spin polarized edge statesJ^-ii^i^iSaiZaiZiiZaiSiZiiZ^ 
For instance, it is found that ZGNRs might exhibit an 
extraordinary large magnetoresistance (MR) effect and a 



spin polarized current 

Here we focus on the current-bias voltage (I-Vh) char- 
acteristic of 7- and 8-ZGNRs with two magnetic con- 
figurations: ferromagnetic (FM) and antiferromagnetic 
(AFM) junctions as shown in Figs. 5(a) and (b), respec- 
tively, where the number, 7 or 8, is the number of car- 
bon atom in the sublattice being across ZGNR along the 
lateral direction. The odd and even cases will also be 
referred to as asymmetric and symmetric, respectively. 
The extended central region C consists of one sublat- 
tice and four unit cells, and contains 72 and 82 atoms 
for 7- and 8-ZGNRs, respectively. The poles of 100 is 
used for the evaluation of the equilibrium density ma- 
trix with the electronic temperature of 300 K, while the 
noncquilibrium term in the density matrix is evaluated 
using the simple quadrature method with the imaginary 
part of 0.01 eV and the grid spacing of 0.02 cV. The geo- 
metric structures used are optimized under the periodic 
boundary condition until the maximum force is less than 
lO^"' hartree/bohr. At each bias voltage the electronic 
structure of ZGNR is self-consistently determined. 

Figures 6(a) and (b) show the current-voltage (I-Vh) 
curves for the up- and down-spin states in the FM junc- 
tions of 7- and 8-ZGNRs. It is found that the current 
for 7-ZGNR linearly depends on the bias voltage, while 
the current for 8-ZGNR is saturated at the bias voltage 
of about |0.5|. The distinct behavior of 8-ZGNR from 7- 
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FIG. 6: (Color online) 7-Vb curves for (a) the up-spin and (b) 
the down-spin states in the FM junction, and (c) the up-spin 
and (d) the down-spin states in the AFM junction of 7- and 
8-ZGNRs. 
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FIG. 7: (Color online) Spin moments of the edge carbons in 
the central region C for the FM junction of 8-ZGNR under 
various applied source-drain bias voltages Vh- 



ZGNR can be more definitely seen in the AFM junction 
as shown in Figs. 6(c) and (d). Interestingly, 8-ZGNR 
with the AFM junction exhibits a diode behavior for the 
spin resolved current. Only the up-spin state contributes 
substantially to the current in the negative bias regime. 
In contrary in the positive bias regime only the down-spin 
state contributes to the current. It is worth mentioning 
that the effect can also be regarded as a spin filter effect. 
On the other hand, the /-Vt, characteristics for 7-ZGNR 
is nearly equivalent to that for the FM junction. The con- 
siderable difference between 7- and 8-ZGNRs in the I-Vh 
characteristics can be attributed to the symmetry of two 
wave functions, tt and tt* states, around the Fermi level. 
For 8-ZGNR the wave functions of the tt and tt* states 
are antisymmetric and symmetric with respect to the a 
mirror plane which is the mid plane between two edges, 
respectively, while those wave functions for 7-ZGNR are 
neither symmetric nor antisymmetric. From a detailed 
analysis2^, it can be concluded that the unique distinc- 
tion in the I-Vh characteristics arises from an interplay 
between the symmetry of wave functions and band struc- 
tures of ZGNRs. In addition, we find that spin moments 
are reduced by applying the finite bias voltage as shown 
in Fig. 7. Since the flat bands around X-point of the 
minority spin state are located about 0.25 eV above the 
Fermi level, the spin moments at the zigzag edges are 
largely reduced by increasing the occupation of the flat 
bands for the minority spin states when the bias voltage 
exceeds the threshold as shown in Fig. 7.— '-'^- The details 
of the analysis on the unique spin diode and filter effect 
of ZGNRs is discussed elsewheref^^ 



2. Fe/MgO/Fe tunneling junction 

The applicability of our implementation to bulk sys- 
tems is demonstrated by an application to a tunneling 
junction consisting of MgO(lOO) layers sandwiched by 



iron. The magneto-tunneling junction was theoretically 
predicted to exhibit a large tunneling-magnetoresistance 
(TMR)ri^ and subsequently the TMR effect has been 
experimentally confirmed.-- We consider four MgO(lOO) 
layers sandwiched by iron with the (100) surface of which 
atomic configuration is shown in Fig. 8(a), where the lat- 
tice constant of the bc-plane used is 2.866 A, and they 
are 2.866 and 4.054 A in iron and MgO regions along the 
a-axis, and the distance between the MgO and iron layers 
is assumed to be 2.160 A. The four MgO(lOO) layers cor- 
responds to the region Co in Fig. 1(a), and four Fe layers 
of the left and right hand correspond the region Lq and 
Rq, respectively. The SCF calculations were performed 
at 1000 K under zero bias voltage using k-points of 7 x 7 
and 130 poles for the integration of the equilibrium den- 
sity matrix. It is found that obtaining the SCF is much 
harder than the case of ZGNR discussed before, and that 
a careful and modest treatment for the charge mixing is 
required. 

As shown in Fig. 8(b), the net charge of iron atoms in 
the interfacial layer is positive due to the coordination 
to an oxygen atom, and the reduction of electrons leads 
to the increase of the magnetic moment of iron atoms 
at the interfacial layer. The increase of the magnetic 
moment at the interfacial layer makes the distinction of 
the majority and minority spin states at the Fermi en- 
ergy clear, i.e., the nature of the majority and minority 
spin states at the Fermi energy can be assigned to s- 
and d-states, respectively. The k-resolved transmissions 
at the chemical potential for the majority and minority 
spin states for the parallel magnetic configuration be- 
tween the left and right leads are shown in Figs. 9(a) 
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FIG. 8: (Color online) (a) Calculation model for a tunneling 
junction consisting of four MgO (100) layers, being the Co re- 
gion, sandwiched by Fe(lOO), being the Lq and Rn regions. 
The red large and small and blue circles denote Fe, O, and 
Mg atoms, respectively, (b) The net charge and spin mag- 
netic moment of a Fe or Mg atom belonging to each layer in 
the parallel magnetic configuration between the left and right 
leads. The position in the horizontal axis exactly corresponds 
to that of the layer in the above figure. 
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FIG. 9: (Color online) k-resolved Transmission at the chem- 
ical potential for (a) the majority spin state of the parallel 
configuration, (b) the minority spin state of the parallel con- 
figuration, and (c) a spin state of the antiparallel configura- 
tion, respectively. For the calculations k-points of 120 x 120 
were used. 



and (b), respectively. The large peak at the F point 
in the majority spin state can be attributed to the s- 
state, while sharp peaks around four pillars come from 
the d-state as discussed in Ref.''^. Note that the posi- 
tion of the sharp peaks is rotated by 45 degrees because 
of the unit cell rotated by 45 degrees compared to that 
in Ref^. The conductances, g'^I- and gJ^L for the 
majority and minority spin states, calculated from the 
average transmission integrated over the first Brillouin 
zone, are 11.99 and 2.82 {H.^^ fim~^), respectively, which 
implies that the tunneling junction may behave as a spin 
filter. The distinction in the conductance should be at- 
tributed to decay properties of states in the insulating 
MgO region coupled with the two stateSf2S For the an- 
tiparallel magnetic configuration the k-resolved transmis- 



sion at the chemical potential is understood as a multi- 
plication of the transmissions for the majority and mi- 
nority spin states in the parallel configuration as shown 
in Fig. 9(c). The conductance, G^^^\ of the antipar- 
allel magnetic configuration is 0.34 (f2~^/im~^) which 
is smaller than those of the parallel case. By defining 
TMR= (G'|,P^-HG|^i^-2G('^P))/(2G'('^P)), we obtain TMR 
of 2082 %, which is compared to 3700 % for a 5 layer MgO 
case reported in Refi^. 



3. LaMnO-i / SrMnO'i superlattice 

When the transmission of a system with the period- 
icity along the a-axis as well as the periodicity of the 
bc-plane is evaluated under zero bias voltage, it can be 
easily obtained by making use of the Hamiltonian cal- 
culated by the conventional band structure calculation 
without employing the Green function method described 
in the paper. This scheme enables us to explore trans- 
port properties for a wide variety of possible geometric 
and magnetic structures with a low computational cost, 
and thereby can be very useful for many materials such 
as supperlattice structures. Once the Hamiltonian and 
overlap matrices are obtained from the conventional band 
structure calculation for the periodic structure, the trans- 
mission is evaluated by Eq. ([5T|) , where all the necessary 
information to evaluate Eq. (|5ip can be reconstructed 
by the result of the band structure calculation. As an 
example of the scheme, we calculate the conductance 
of a (LaMn03)/(SrMn03) superlattice with four differ- 
ent magnetic structures, i.e., ferromagnetic. A- type, G- 
type, and C-type antiferromagnetic configurations of Mn 
sites 

In recent years, it has been found experimentally 
that the superlattice structures consisting of LaMnOa 
and SrMnOs layers exhibit a metal-insulator transition 
in terms of the layer thickness.-- Bhattacharya et al. 
fabricated (LaMn03)2n/(SrMn03)„ superlattices on a 
SrTiOa (001) substrate, and measured the in-plane re- 
sistivity as a function of temperaturcj^ The resistivity 
measurement indicates that the superlattices are metal- 
lic and insulating for n < 2 and n > 3, respectively. 
On the other hand, the (LaMn03)2/(SrMn03)2 super- 
lattices fabricated on the same substrate by Nakano et 
al. exhibit a sample dependence in the resistivity mea- 
surement, i.e., one of three samples is metallic and the 
others are insulatingj^ They argued that the metallic 
behavior observed in the one sample may be attributed 
to a certain structural incompleteness in the superlattice 
structure, and that the ideal superlattice should become 
insulating based on their experimental results. A theoret- 
ical model calculation for the (LaMn03)2n/(SrMn03)„ 
superlattices suggests that the metal-insulator transi- 
tion at n = 3 can be explained by existence of the 
G-type antiferromagnetic barrier in the SrMn03 lay- 
ers sandwiched by the LaMnOs layers with the ferro- 
magnetic configuration..®'* Since the analysis by Nakano 
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FIG. 10: (Color online) (a) Optimized geometric structure 
of (LaMn03)/(SrMn03) superlattice with the ferromagnetic 
spin configuration of Mn sites. The middle sized blue and 
small red circles denote Mn and O atoms, while Sr and La 
atoms are denoted by the mark. Density of states (DOS) of 
the (LaMn03)/(SrMn03) superlattice for (b) ferromagnetic 
and (c) A-type antiferromagnetic configurations. 



et al. suggests that the charge transfer between the 
LaMnOa and SrMnOa layers is rather localized in the 
vicinity of the interfacCf^^ the model should be applica- 
ble to the case of (LaMn03)2/(SrMn03)2 without sig- 
nificantly depending on the ratio between the thick- 
nesses of (LaMnOs) and (SrMnOs) layers, indicating 
that (LaMn03)2/(SrMn03)2 is metallic. However, the 
naive consideration evidently contradicts the experimen- 
tal result,2^ 

As a first step towards comprehensive understanding 
of transport properties of the superlattice structures by 
the first principle calculations, we consider the simplest 
superlattice, i.e., (LaMn03)/(SrMn03). In the calcula- 
tions, the in-plane lattice constant is fixed to be 3.905 
A which is equivalent to that of the SrTi03 substrate. 
The out-of-plane lattice constant is assumed to be 7.735 
A, since those are experimentally determined to be 3.959 
A and 3.776 A for the LaMn03 and the SrMn03 layers, 
respectively, grown on the SrTi03 substrate and the av- 
erage out-of-plane lattice constant for the superlattices 
is nearly equivalent to the average of the two values.— 
With those lattice constants internal structural parame- 
ters are optimized for each magnetic configuration with- 
out any constraint until the maximum force is less than 



TABLE I: Total energy (meV) per formula unit, 
LaMn03/SrMn03, and conductance G fj,m~^) of 

the (LaMn03)/(SrMn03) supperlattice with four different 
magnetic configurations, i.e., ferromagnetic (F), A-type (A), 
G-type (G), and C-type (C) antiferromagnetic configurations 
of Mn sites. The total energy is measured relative to that of 
the ferromagnetic configuration. G|,in is the in-plane con- 
ductance for the up spin state, and the others are construed 
in the similar way. For the conductance calculations k-points 
of 60 X 60 were used. 
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2.0 X 10"'^ hartree/bohr. The optimized structure for the 
ferromagnetic configuration is shown in Fig. 10(a). It is 
found that the position of oxygen atoms is largely dis- 
torted due to the different ionic radii between La and Sr 
atoms, showing that Mn atoms are located in the center 
of each distorted octahedron. Also, bond angles of Mn- 
0-Mn are found to be 167.4 and 161.6 (degree) for the in- 
plane and out-of-plane, respectively. The total energies 
relative to the ferromagnetic configuration are listed in 
Table |T1 The calculated ground state is the ferromagnetic 
configuration, and the A-type antiferromagnetic configu- 
ration lies just above 5 meV per formula unit. It may be 
considered that the nearly degeneracy between the two 
configurations corresponds to the neighborhood of the 
boundary at x = 0.5 and c/a — 1 in the phase diagram 
for the tetragonal Lai_2;Sra;Mn03.— The two configura- 
tions have both a metallic DOS, while the ferromagnetic 
configuration is half-metallic as shown in Figs. 10(b) and 
(c) , reflecting the large in-plane and out-of-plane conduc- 
tances as shown in Table HI From a detailed analysis (not 
shown) of DOSs, we see that the electronic states at the 
Fermi level are composed of eg orbitals of Mn atoms and 
p orbitals of oxygen atoms. Also, the MuUiken popula- 
tion analysis (not shown) suggests that the charge state 
of Mn atoms in the superlattice is in between those in 
the LaMn03 and SrMn03 bulks. This implies that the 
metallic band structures are induced by partial filling of 
the eg bands. Since the bond angle of Mn-O-Mn for 
the out-of-plane is slightly acute, therefore, this may be 
attributed to reduction of the conductance of the ferro- 
magnetic state in the out-of-plane as shown in Table [H 
A systematic study for the thicker cases and the effect of 
Coulomb interactions^, in the eg orbitals is highly desir- 
able, and the details will be discussed elsewhere. 
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FIG. 11; (Color online) Speed-up ratio in the parallel compu- 
tation of the calculation of the density matrix for the FM junc- 
tion of 8-ZGNR by a hybrid scheme using MPI and OpenMP. 
The speed-up ratio is defined by Ti/Tp, where Ti and Tp are 
the elapsed times by a single core and a parallel calculations. 
The cores used in the MPI and OpenMP parallelizations are 
called process and thread, respectively. The parallel calcula- 
tions were performed on a Cray XT5 machine consisting of 
AMD opteron quad core processors (2.3GHz). In the bench- 
mark calculations, the number of processes is taken to be 
equivalent to that of processors. Therefore, in the paralleliza- 
tion using 1 or 2 threads, 3 or 2 cores are idle in a quad core 
processor. 



E. Parallelization 

The computation in the the NEGF method can be par- 
allelized in many aspects such as k-points, energies in the 
complex plane at which the Green functions are evalu- 
ated, spin index, matrix multiplications, and calculation 
of the inverse of the matrix. Here we demonstrate a good 
scalability of the NEGF in the parallel computation by a 
hybrid scheme using the message passing interface (MPI) 
and OpenMP which are used for inter-nodes and intra- 
node parallelization, respectively. The Green function 
defined by Eq. ^ is specified by the k-point, energy 
Z, and spin index a. Since the calculation of the Green 
function specified by each set of three indices can be inde- 
pendently performed without any communication among 
the nodes, we parallelize triple loops corresponding to the 
three indices using MPI. Each node only has to calculate 
the Green functions for an allocated domain of the set 
of indices, and partly sum up Eq. (pS)) or (|44p in a dis- 
cretized form. After all the calculations finish, a global 
summation among the nodes is required to complete the 
calculation of Eq. or ([33]), which, in most cases, is 
a very small fraction of the computational time even in- 
cluding the MPI communication among the nodes. Thus, 
the reduction of scalability for the parallelization of the 
three indices is mainly due to imbalance in the allocation 
of the domain of the set of indices. The imbalance can 



happen in the case that the number of combination for 
the three indices and the number of processes in the MPI 
parallelization are relatively small and large, respectively. 
In addition to the three indices, one may notice that the 
matrix multiplications and the calculation of the matrix 
inverse can be parallelized, which are situated in the inner 
loops of the three indices. The evaluation of the central 
Green function given by Eq. ^ and the surface Green 
functions given by Eq. pT|) . and the self energies given 
by Eqs. ([7]) and ([5]) are mainly performed by the matrix 
multiplications and the calculation of the matrix inverse. 
We parallelize these two computations using OpenMP in 
one node. Since the memory is shared by threads in the 
node, the communication of the data is not required un- 
like the MPI parallelization. However, the conflict in the 
data access to the memory can reduce the scalability in 
the OpenMP parallelization. As a whole, in our imple- 
mentation the k-point, energy Z , and spin index a are 
parallelized by MPI, and the matrix multiplications and 
the calculation of the matrix inverse are parallelized by 
OpenMP. 

In Fig. 11 we show the speed-up ratio in the elapsed 
time for the evaluation of the density matrix of 8-ZGNR 
under a finite bias voltage of 0.5 eV. The geometric and 
magnetic structures and computational conditions for 8- 
ZGNR are same as before. The energy points of 197 (101 
and 96 for the equilibrium and nonequilibrium terms, re- 
spectively) are used for the evaluation of the density ma- 
trix. Only the F point is employed for the k-point sam- 
pling, and the spin polarized calculation is performed. 
Thus, the combination of 394 for the three indices are 
parallelized by MPI. It is found that the speed-up ratio 
of the flat MPI parallelization, corresponding to 1 thread, 
reasonably scales up to 64 processes. Furthermore, it can 
be seen that the hybrid parallelization, corresponding to 
2 and 4 threads, largely improves the speed-up ratio. By 
fully using 64 quad core processors, corresponding to 64 
processes and 4 threads, the speed-up ratio is about 140, 
demonstrating the good scalability of the NEGF method. 



V. CONCLUSIONS 

We have presented an efficient and accurate implemen- 
tation of the NEGF method for electronic transport cal- 
culations in combination with DFT using pseudo-atomic 
orbitals and pseudopotentials. In the implementation, 
we have developed accurate methods for the evaluation 
of the density matrix and the treatment of the boundary 
between the scattering region and the leads. 

A contour integration method with a continued frac- 
tion representation of the Fermi-Dirac function has been 
successfully applied for the evaluation of the equilibrium 
term in the density matrix, which evidently outperforms 
the previous method,— while a simple quadrature scheme 
on the real axis with a small imaginary part is employed 
for that for the nonequilibrium term in the density ma- 
trix. It has been demonstrated by numerical calculations 
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that the accuracy of 10~^ eV per atom in (i?tot — -2'scif ) is 
attainable using the energy points of 200 in the complex 
plane even under a finite bias voltage of 0.5 V at 600 K. 
However, the evaluation of the nonequilibrium density 
matrix still requires a careful treatment where the pole 
structure of the Green functions has to be smeared by in- 
troducing a finite imaginary part. The numerical calcu- 
lations suggest that the number of energy points required 
for the convergence can be largely reduced by introducing 
the imaginary part of 0.01 eV without largely changing 
the calculated results in a practical sense. We also note 
that the accurate evaluation of the density matrix pro- 
vides another advantage that the SCF calculations even 
under a finite bias voltage smoothly converge in a simi- 
lar fashion as the conventional band structure calculation 
does. 

We have also developed an efficient method for calcu- 
lating the Hartree potential by a combination of the two 
dimensional FFT and a finite difference method without 
any ambiguity in reproducing the boundary conditions. 
In addition, a careful evaluation of the charge density 
near the boundary between the scattering region and the 
leads is presented in order to avoid the spurious scatter- 
ing accompanied by the inaccurate construction of the 
charge density. The proper treatment for the charge eval- 
uation in our implementation can definitely be verified 
by a comparison between the conventional band struc- 
ture calculation and the EGF method with respect to 
the DOS of the carbon chain. 

Finally, we have demonstrated the applicability of 
our implementation by calculations of spin resolved I- 
Vb characteristics of ZGNRs, showing that the /-Vb 
characteristics depend on the symmetry of ZGNR, and 
that the symmetric ZGNR exhibits a unique spin diode 
and filter effect. Also, the applicability of our im- 
plementation to bulk systems is demonstrated by ap- 
plications to a Fe/MgO/Fe tunneling junction and a 
LaMnOa/SrMnOa superlattice. Based on the above dis- 
cussions and the good parallel efficiency in the hybrid 
parallelization shown in the study, it is concluded that 
our implementation of the NEGF method can be appli- 
cable to challenging problems related to large-scale sys- 
tems, and can be a starting point, apart from numerical 
spurious effects, to include many body effects beyond the 
one particle picture in the electronic transport. 
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APPENDIX A: TOTAL ENERGY FUNCTIONAL 
FOR THE EQUILIBRIUM STATE 

Let us introduce the following density functional which 
may define the total energy of the central region C be- 
ing a part of the extended system at equilibrium with a 
common chemical potential /i: 



En 



E. 



Escli, 



(Al) 



where Ecxt is the Coulomb interaction energy between 
electrons and the external potential of the central region 
C given by 



E,. 



dr^n{r)vcKt{r), 



(A2) 



and Ecc is the Hartree energy defined by 

1 f /".,.„ (n(r) + n'(r)) (n(r') + n'(r')) 



En 



dr^dr'^ 



(A3) 



where n' is an additional electron density which arises 
from the boundary condition between the central and 
lead regions. In fact, the additional electron density n' 
can be obtained by back Fourier transforming Eq. (j37p 
divided by 47r(Ax)^ within our treatment. The last term 
of Eq. (jAip . -Escif, is a self energy density functional, 
which may correspond to the energy contribution from 
the self energy due to the semi infinite leads, given by 



-Esoif = Tr 



2 /"^ 

-Im / dEGc{E+)K{E) 

J-oo 



with 



K{E) = S(£;+) - E 



dY.{E^ 
dE 



(A4) 



(A5) 



where E^ = E + iO+, the factor of 2 is due to the spin 
multiplicity, and the self energy E is the sum of the self 
energies arising from the left and right leads. Although 
we neglect the spin and k-dependency on the formula- 
tion for the simplicity throughout the Appendix, its gen- 
eralization with the dependency is straightforward. The 
kinetic energy E^kin can be evaluated by Eq. ([6]) as the 
band energy -Eband minus double counting corrections as 
follows: 
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where i?band is defined by 
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It is noted that the last term in Eq. (jA6p cancels the 
contribution from the first term in Eq. (|A5p . The effec- 
tive potential VcS in the second term of Eq. (|A6P will be 
defined later. Also, the exchange-correlation energy E^c 
in Eq. (|Aip is considered to be a density functional such 
as local density approximations (LDA) and generalized 
gradient approximations (GGA) evaluated using electron 
density n in the central region C. 

We now consider the variation of the total energy Etot 
with respect to n. The variations of E^xt and E^c are 
simply given by 



is easily found as 



,3 / /x'5wcff(r') 



i5ri(r) 
(A12) 



Thus, it turns out using Eqs. (jA8P - (jA12p that the varia- 
tion of the total energy Etot is given by 

•5 [Etot] 



S[Eoxt] = J dr^Sn{r)voxt{r) (AS) = J dr^Sn{r) 



and 



dr^5n(r) / dr 



,^n{r') + n'ir') 
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By noting the Dyson equation and GcSc'Gc = 
Gc^Gc which are both derived from Eq. 
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0), and 



Tt{AB) ~ Tr(i?A), the variation of i?band is given by 
two contributions: 
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Letting 6 [Etot] be zero so that the variation of the total 
energy .Etot can be always zero with respect to n, we 
obtain a form of the effective potential: 



Woff(r) = Wcxt(r) + dr 



,3n(r')+n'(r') , SE^ 



|r — r'l Sn{r) 

(A14) 

It is found that the effective potential takes the same form 
as in the Kohn-Sham method."*^ The fact implies that the 
self consistent solution of the Green function under the 
zero bias condition may correspond to the minimization 
of the total energy functional defined by Eq. (|Aip . since 
in practice the Green function defined by Eq. ([6]) is cal- 
culated using the effective potential given by Eq. (|A14[) 
as a consequence of combining the NEGF method with 
DFT. 

The generalization of the functional to the metallic 
case with a finite temperature and the noncquilibrium 
state might be an important direction in the future study 
so that forces on atoms can be variationally calculated 
from the functional, since the existence of a variational 
functional has been recently suggested for the noncqui- 
librium steady state^li^^i^ 



The trace in the first term of Eq. I|A10|) can be trans- 
formed by considering a partial integral and assuming 
the system to be insulating as follows: 



Tr 



2 

Im 

TT 



\ 6n(r) 



dE 



= -Tr 



Im 

TT 



' dE ( ] Gc{E+) 



6n(r) 



, /3 I ,^'5^'off(r') 
dr n(r I ^ , , . 



(All) 



It is also noted that the second term in Eq. (jAlOp cancels 
the variation of the contribution from the second term in 
Eq. (|A5p . The variation of the second term in Eq. (|A6P 



APPENDIX B: ENERGY DENSITY MATRIX 



The equilibrium energy density matrix e 



(eq) 



where 



one of the associated basis orbitals is in the central cell 
and the other is in the cell denoted by Rn, is calculated 
using the contour integration method applied to the equi- 
librium density matrix as follow: 



-o-.R,, 



1 



c JBZ 



(Bl) 



with 



/oo 
^dEEG';^l{E±tO+)f{E~f,). (B2) 
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If the Hamiltonian and overlap matrices are k- 
independent, as well as Eq. (P^ . Eq. (|B2p can be simpli- 
fied to: 



eafi = Im 



1 f°° 

- dEEG„.c{E + iO+)f{E- fi) 



(B3) 



For the general case with the k-dependent Hamiltonian 
and overlap matrices, Eq. (|B2[) is evaluated by the con- 
tour integration method with the special form of Fermi- 
Dirac function given by Eq. (I25|) as follows: 



(k) 



with 



^ p=l 



p 



70 



(B4) 



(B5) 



where jJ^^'^^ is the first order moment of the Green func- 
tion G^^)j. In Eq. jBH), a term, j^Yimn^oo R^J■^a'°\ 



which appears mutually for e^*^]., is omitted, since the 
diverging terms cancel each other out in Eq. (jBip . By 
making use of the moment representation of the Green 
function^i^ the following simultaneous linear equation is 
derived for the zero and first order moments: 



(B6) 



Letting zq and zi be iR and —R, respectively, the zero 
and first order moments can be evaluated by 




A^r^ = Y^(GS(*i?)-GS(-i?)), (B7) 
/^^'^ = TT^(<c(*^) + G!^c(-i?)), (B8) 



where i? is a large real number so that the higher order 
moments can be neglected. 

For the nonequilibrium Green function, the nonequi- 
librium contribution Aco- in the the energy density ma- 
trix e^a°'^'^ can be calculated using the simple quadrature 
scheme in the same way as for the nonequilibrium term 
in the density matrix. 
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